An Unsupervised Parameter Estimation Algorithm for a Generative Dependency N-gram Language Model

Authors

  • Chenchen Ding
  • Mikio Yamamoto
Abstract

We design a language model based on a generative dependency structure for sentences. The parameter of the model is the probability of a dependency N-gram, which is composed of lexical words with four kinds of extra tags used to model the dependency relation and valence. We further propose an unsupervised expectation-maximization algorithm for parameter estimation, in which all possible dependency structures of a sentence are considered. As the algorithm is language-independent, it can be used on a raw corpus from any language, without any part-of-speech annotation, treebank or trained parser. We conducted experiments using four languages: English, German, Spanish and Japanese. The results illustrate the applicability and the properties of the proposed approach.
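The abstract describes an EM procedure that sums over all possible dependency structures of each sentence. The following is a minimal, hypothetical sketch of that idea, not the authors' actual model: it uses plain head-dependent bigrams (no valence tags, no N-grams beyond order 2) and, for simplicity, lets each word choose its head independently instead of enforcing a well-formed tree. The E-step computes a posterior over head choices; the M-step re-estimates the bigram probabilities from expected counts.

```python
# Hypothetical EM sketch for a dependency bigram model.
# Simplifications vs. the paper: no valence/direction tags, no tree
# constraint (each word picks a head independently), bigrams only.
from collections import defaultdict

ROOT = "<ROOT>"
corpus = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]

# Start from uniform (unnormalized) weights for every head-dependent pair.
prob = defaultdict(lambda: 1.0)

def normalize(counts):
    # M-step: turn expected counts into conditional probabilities p(dep | head).
    totals = defaultdict(float)
    for (h, d), c in counts.items():
        totals[h] += c
    return {(h, d): c / totals[h] for (h, d), c in counts.items()}

for _ in range(10):  # EM iterations
    counts = defaultdict(float)
    for sent in corpus:
        for i, dep in enumerate(sent):
            # Candidate heads: the root symbol or any other word in the sentence.
            heads = [ROOT] + [w for j, w in enumerate(sent) if j != i]
            weights = [prob[(h, dep)] for h in heads]
            z = sum(weights)
            for h, w in zip(heads, weights):
                counts[(h, dep)] += w / z  # E-step: posterior over head choices
    prob = defaultdict(float, normalize(counts))

for (h, d), p in sorted(prob.items()):
    print(f"p({d} | {h}) = {p:.3f}")
```

Because each word's head choice is independent here, the sum over structures factorizes and the E-step is exact; summing over actual dependency trees, as the paper does, would instead require dynamic programming over the space of trees.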


Similar Articles

A Generative Dependency N-gram Language Model: Unsupervised Parameter Estimation and Application

We design a language model based on a generative dependency structure for sentences. The parameter of the model is the probability of a dependency N-gram, which is composed of lexical words with four types of extra tags used to model the dependency relation and valence. We further propose an unsupervised expectation-maximization algorithm for parameter estimation, in which all possible dependenc...

Unsupervised Learning of Hierarchical Dependency Parsing

In a sentence, a dependency is a link from one word, called a dependent or attachment, to another, the head. If we imagine that the global head word itself connects to some root node, then a sentence of length n corresponds to a parse tree with exactly n links. Constituency parses, such as those in the Wall Street Journal section of the Penn Treebank, can be converted to dependency parses usin...

Unsupervised Bayesian Parameter Estimation for Dependency Parsing

We explore a new Bayesian model for probabilistic grammars, a family of distributions over discrete structures that includes hidden Markov models and probabilistic context-free grammars. Our model extends the correlated topic model framework to probabilistic grammars, exploiting the logistic normal prior as a prior over the grammar parameters. We derive a variational EM algorithm for that model...

Unsupervised Learning of Dependency Structure for Language Modeling

This paper presents a dependency language model (DLM) that captures linguistic constraints via a dependency structure, i.e., a set of probabilistic dependencies that express the relations between headwords of each phrase in a sentence by an acyclic, planar, undirected graph. Our contributions are three-fold. First, we incorporate the dependency structure into an n-gram language model to capture...

Generative Incremental Dependency Parsing with Neural Networks

We propose a neural network model for scalable generative transition-based dependency parsing. A probability distribution over both sentences and transition sequences is parameterised by a feedforward neural network. The model surpasses the accuracy and speed of previous generative dependency parsers, reaching 91.1% UAS. Perplexity results show a strong improvement over n-gram language models, ...


Journal title:

Volume   Issue

Pages  -

Publication date: 2013